INTRODUCTION

This is a time of explosive growth in 
the fields of evolutionary and population 
genetics, with whole genome sequencing 
and bioinformatics driving a transforma- 
tive paradigm shift (Morozova and Marra, 
2008). At the same time, advances in 
epigenetics are thoroughly transforming 
our understanding of evolutionary
processes and their implications for
populations, species and communities (Callinan
and Feinberg, 2006). These revolutionary 
changes present tremendous opportuni- 
ties and challenges to our field (Table 1). 
In this essay, I will lay out my personal 
interpretation of what some of the biggest 
opportunities and challenges are for evo- 
lutionary and population genetics over the 
next decade. I believe that for our field to 
take full advantage of these tremendous 
opportunities, we must effectively com- 
bine genomics, epigenetics, bioinformat- 
ics, experiments and modeling (Figure 1). 
Genomic pipelines are rapidly producing 
intractably large volumes of data (e.g., 
Griffiths-Jones et al., 2008), often without
sufficient forethought about what the
data will be used for, or how they will
be curated, archived, and analyzed. We
would be well served by thinking carefully 
in advance about hypotheses, what data 
would be best suited to address them, what 
experiments could be designed to evalu- 
ate and validate results, and how power- 
ful modeling approaches could be coupled 
with experimentation and data mining 
to generalize experimental results and 
explore their implications across scales of 
biological organization from nucleotides 
to ecosystems. 



Today our field is justifiably obsessed 
with the explosive emergence of vast 
genomic data sets and the opportunities 
of integrating epigenetics with genomics 
to explore epigenomic patterns of gene 
regulation (Griffiths-Jones et al., 2008;
Suzuki and Bird, 2008). However, the 
informatics challenges that attend this 
emergence are often not fully appreciated. 
The sheer vastness of genomic data sets 
is spectacular to contemplate, and can be 
terrifying to witness. Scientists are often 
overwhelmed and drown in the vastness of 
these emerging data. More data does not 
necessarily lead to better understanding. 
Often the flood of data can force scien- 
tists to focus on data storage and curation 
to such a degree that they completely lose 
sight of hypotheses about relationships 
between these data and biological processes,
how the patterns we observe in these vast 
data sets can be tested through controlled 
and replicated experimentation, and how 
the results can be generalized across scale 
through simulation and modeling. 

INTERSECTION OF GENOMICS AND EPIGENETICS
WITH EXPERIMENTATION

DATA WITHOUT EXPERIMENTS AND EXPERIMENTS
WITHOUT APPROPRIATE DATA ARE BOTH EQUIVOCAL

Experimentation has always been one of 
the most effective means to achieve reli- 
able knowledge (Wright, 1984). Through 
replication and control, variation can be 
quantified and accounted for and spurious 
effects can be removed, leading to reli- 
able inferences about drivers and strong 
tests of hypotheses. In our field, linking
common garden experiments (Whitham
et al., 2006) with genomic and epigenomic
datasets presents a tremendous
opportunity to advance understanding of
genetic controls on phenotype and fitness
(Figure 1A). In the sea of genomic
data, it is all too tempting to use data 
mining techniques to seek correlations 
between genetic patterns and some pro- 
cess of interest. Finding such correlations 
suggests hypotheses, but does not provide 
a strong basis to evaluate whether these 
hypotheses may be true. The phenomenon 
of the under-determination of theories by 
facts suggests that there may be innu- 
merable ways in which a given observed 
pattern of genetic variation could have 
been produced in a population, and to 
avoid logical inferential errors of affirming 
the consequent it is absolutely essential to 
test hypotheses in controlled and replicated
experiments, such as common gardens. 

INTERSECTION OF EXPERIMENTATION 
AND MODELING 

EXPERIMENTS WITHOUT MODELS ARE NOT 
EXTENSIBLE; MODELS WITHOUT 
EXPERIMENTS ARE NOT VERIFIABLE 

Experiments provide a powerful means to 
control one or a few processes hypoth- 
esized to drive genetic and epigenetic 
structure of populations (e.g., Kohler, 
1994). However, experiments necessarily 
are limited to a few interactions, at rela- 
tively small scales and over relatively short 
temporal extents. Simply put, experiments 
without models are not extensible, and 
models without experiments are not
verifiable (Figure 1B). Simulation modeling
provides tremendous abilities to explore 
Table 1 | Grand challenges facing evolutionary and population genetics related to genomics, epigenomics, bioinformatics, modeling and
experimentation, and their integration.

GENOMIC, EPIGENOMIC, AND BIOINFORMATICS GRAND CHALLENGES

A Improved efficiency and effectiveness of whole genome sequencing and development of broad libraries of genomes of non-model organisms.

B Improvements of fine-scale genetic mapping to quantify patterns of genetic linkage across genomes.

C Improved quantification, measurement, and understanding of the genetic architecture and processes controlling heterosis, epistasis and
pleiotropy, and other interactions between loci and alleles within the genome.

D Improved understanding of the architecture and processes affecting heritable variation in gene activity not caused by changes in DNA sequence.

E Understanding the architecture and processes driving changes of transcriptional potential of a cell.

F Improved understanding of causes and consequences of DNA methylation and histone modification.

G Improved understanding of interactions between genomic variation and epigenetic processes, such as effects and heritability of repressor
proteins attached to silencer regions of DNA.

H Improved methods and tools for organizing, analyzing, storing and retrieving vast genomic and epigenomic datasets.

MODELING GRAND CHALLENGES

A Developing computationally efficient spatially explicit, individual based models that simulate dispersal, mating, genetic exchange, and mortality as
functions of cost-distance between individuals resulting from differential patterns of movement in heterogeneous landscapes.

B Incorporating selection into spatially explicit, individual based genetics models, such that models allow evaluation of differential patterns of
selection across complex fitness landscapes, and the interaction of differential patterns of gene flow with differential patterns of local selection.

C Improving how genomic data are modeled in individual-based, spatially explicit gene flow and selection models.

D Using the improved models described in (A-C) to evaluate relationships between landscape resistance, landscape heterogeneity, population
distribution and density and spatial patterns of allelic richness, heterozygosity, inbreeding coefficient, and effective population size.

E Using the improved models described in (A-C) to evaluate time lags in the emergence of genetic structure and equilibration of genetic diversity in
spatially structured populations.

F Using the improved models described in (A-C) to evaluate mechanisms for sympatric and peripatric speciation as functions of restricted gene
flow and differential local directional selection.

G Using the improved models described in (A-C) to evaluate the interactive effects of landscape heterogeneity, landscape dynamics, and population
dynamics on power of different statistical modeling approaches to reliably detect and predict changes in genetic diversity, population structure,
and fitness in response to spatial patterns in the environment and fluctuations in population size and environmental conditions.

EXPERIMENTATION GRAND CHALLENGES

A Designing and implementing replicated common garden experiments in which genotypes collected from across broad environmental gradients
are reciprocally transplanted in replicated experimental gardens that span the range of environmental conditions in the field.

B Incorporating multi-species, community-genetics designs into replicated common gardens to evaluate the interactions between genetic
characteristics of foundation species and the genetic characteristics and composition of associated communities.

C Conducting long-term experiments in which strength of selection is controlled to identify the genomic and epigenomic structure and processes
underlying adaptation.

D Conducting long-term experiments in which rates of migration and strength of selection are controlled in a spatially structured environment to
quantify interactions between gene flow, epigenetic processes and selection in influencing genetic diversity, fitness and reproductive isolation.

E Conducting long-term experiments in which species interactions, such as competition, commensalism and predation, are manipulated across
gradients of differential gene flow and selection to understand how population processes across multi-species communities interact to drive
evolution of the individual species.

GRAND CHALLENGES INVOLVING THE COMBINATION OF MODELING WITH GENOMICS/EPIGENOMICS/BIOINFORMATICS

A Combining simulation modeling and genomic data to better understand processes of non-additive gene interaction, such as epistasis and
polygenic effects on phenotype, and how they influence evolution across complex spatially heterogeneous adaptive landscapes.

B Combining simulation modeling and genomic/epigenomic data to better understand the causes and consequences of pleiotropy in natural and
simulated populations, specifically how spatial and temporal fluctuations in heterogeneous adaptive landscapes may affect the outcome of fitness
tradeoffs of pleiotropic effects in terms of patterns of gene frequency across a spatially structured population.

C Using simulation modeling to evaluate the evolutionary influences of epigenomic processes, such as DNA methylation, histone modification and
repressor proteins, in spatially complex and temporally varying environments and in multi-species interactions.

D Using improved understanding of genomic and epigenomic architecture to improve realism and usefulness of spatially explicit, individual-based
simulation models.

E Using simulation models to evaluate what kinds of genomic and epigenomic data to produce for a given research objective, in terms of what kinds
of markers, how many markers, from what parts of the genome, from how many individuals, and from which locations across the population.

GRAND CHALLENGES INVOLVING THE COMBINATION OF EXPERIMENTATION WITH GENOMICS/EPIGENOMICS/BIOINFORMATICS

A Designing controlled and replicated experiments to test hypotheses about gene linkage, epistasis, pleiotropy, and polygenic effects on fitness.

B Designing controlled and replicated experiments to test hypotheses about heritable variation in gene activity that is not caused by changes in DNA
sequence.

C Designing controlled and replicated experiments that are able to separate and quantify the relative effects and interactions of genomic and
epigenomic processes in driving evolution.

D Using information from whole genome scans and gene mapping to inform experiments as to what loci and what markers to include as response
factors in experiments that manipulate selection gradients and species interactions.

GRAND CHALLENGES INVOLVING THE COMBINATION OF MODELING AND EXPERIMENTATION

A Using simulation modeling to evaluate alternative experimental designs in terms of tradeoffs in sample size, experimental complexity, variance
and effect sizes to inform design of optimal experiments.

B Using experiments to confirm and validate predictions of simulation modeling.

C Using simulation modeling to generalize experimental findings by evaluating potential outcomes of identified processes in novel conditions,
heterogeneous landscapes, fluctuating environments, and across broad ranges of spatial and temporal scale.

GRAND CHALLENGES INVOLVING THE COMBINATION OF MODELING, EXPERIMENTATION, AND GENOMICS/EPIGENOMICS/BIOINFORMATICS

A The greatest opportunities for advancing the fields of evolutionary and population genetics involve combining modeling, experimentation,
genomics, epigenomics, and bioinformatics.

B Combining bioinformatics with modeling and experimentation to link vast genomic and epigenomic databases to spatially explicit simulations
which are then validated and calibrated by controlled and replicated manipulative experiments.

C Experiments provide decisive proof of cause-effect relationships relating genomic and epigenomic variation to evolutionary and population
genetic processes, while modeling allows exploration and generalization of the implications of these relationships across scales in spatially
complex and temporally varying conditions, such as predominate in actual populations.



the implications of alternative hypotheses
(Epperson et al., 2010; Landguth and
Cushman, 2010). This allows researchers 
to a priori evaluate sampling designs 
and analytical approaches to optimize 
the research to provide high power to 
evaluate hypotheses. In addition, simula- 
tion enables generalization of the results 
of experiments across scales to explore 
the implications of causal relationships 
in real populations. Ecosystems are the 
stage on which the play of evolution 
is acted, and ecosystems are complex, 
spatially structured, and temporally vary- 
ing. Our hypotheses typically focus on 
relatively simple relationships between 
mechanisms and responses, and our 
experiments to test them focus on these 
relationships at small scales over short 
time periods. Simulation modeling is crit- 
ical to explore how these pattern-process 
relationships propagate across scale and 
how variation in these processes across 
space and through time influences their 
outcomes. 

INTERSECTION OF GENOMICS AND
EPIGENETICS WITH MODELING

MODELS WITHOUT DATA ARE NOT
COMPELLING; DATA WITHOUT MODELS ARE
NOT INFORMATIVE

Integration of genomic and epige- 
nomic datasets with genetic modeling 
is improving model testing, validation 
and calibration (Figure 1C). Spatially
explicit, individual-based simulation modeling
has advanced such that relationships
between environmental characteristics,
population structure and the genetic or
epigenetic characteristics of populations can
be rigorously modeled (Landguth and
Cushman, 2010; Landguth et al., 2012).
The genomic revolution is providing 
researchers vast data sets comprising the 
genomic and epigenomic characteristics 
of many individuals, which enables models
to be optimized to training datasets,
and validated using independent test- 
ing data. Just as models without data 
are not compelling, data without models 
are not informative. Simulation model- 
ing has tremendous potential to quantify 
genetic processes, explore how they prop- 
agate across space and through time, 
and predict the effects of changes to the 
pattern-process relationship. 

PUTTING IT ALL TOGETHER

In the sections above I described the chal- 
lenges and opportunities presented by the 
intersection of vast genomic and epige- 
nomic datasets, controlled genetic exper- 
iments, and simulation modeling. There 
is a synergistic dependence among these 
three fields in advancing evolutionary 
and population genomics. Data alone are 
not informative. Models alone are not 
compelling. Experiments alone are not 
generalizable. It is the intersection among 
these three different scientific endeavors
that provides the best means of addressing
the most difficult and important challenges
in evolutionary and population
genetics. 

Perhaps the best way to illustrate this 
synergy is through an example of how 
the three-way intersection of genomic 
data, controlled experiments and simu- 
lation modeling might be implemented 
(Figure 1D). For the sake of illustration,
our task is to predict the effects of climate 
change on gene flow and adaptive evolu- 
tion of a population across a large geo- 
graphical extent. We hypothesize that gene 
flow of this species would be affected by 
distance among individuals, and that the 
population would be differentially adapted 
to environmental conditions across the 
range (e.g., Landguth et al., 2012). With 
this hypothesis we might begin by con- 
ducting a large simulation experiment to 
evaluate how different degrees of gene 
flow and different strengths of selection 
across environmental gradients would be 
expected to affect genetic characteristics 
of the population. The results of these 
simulations would guide sampling design 
to detect the hypothesized relationship 
across a reasonable range of effect size 
and variability. We would implement this 
sampling regime, producing a large spa- 
tially referenced genomic or epigenomic 
data set. 
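
The kind of simulation experiment described above can be sketched in a few lines of code. The following is a minimal illustration only, not the individual-based, spatially explicit approach of the cited studies: it assumes a one-dimensional stepping-stone chain of demes, a single biallelic locus, genic selection whose strength varies linearly along an environmental gradient, and binomial drift. All function names, parameters and default values are hypothetical choices made for the example.

```python
import random


def simulate_cline(n_demes=10, n_diploid=100, m=0.05, s_max=0.05,
                   generations=200, seed=0):
    """Return final frequencies of allele A in each deme of a linear chain.

    Assumed toy model: selection on A ranges from -s_max at one end of
    the gradient to +s_max at the other, a fraction m of each deme is
    replaced by migrants from its neighbours, and drift is binomial.
    """
    rng = random.Random(seed)
    copies = 2 * n_diploid
    p = [0.5] * n_demes
    # Selection coefficient for allele A in each deme along the gradient.
    s = [s_max * (2 * i / (n_demes - 1) - 1) for i in range(n_demes)]
    last = n_demes - 1
    for _ in range(generations):
        # Genic selection within each deme.
        p = [pi * (1 + si) / (1 + pi * si) for pi, si in zip(p, s)]
        # Nearest-neighbour migration with reflecting boundaries.
        p = [(1 - m) * p[i]
             + (m / 2) * (p[max(i - 1, 0)] + p[min(i + 1, last)])
             for i in range(n_demes)]
        # Drift: binomial sampling of 2N gene copies per deme.
        p = [sum(rng.random() < pi for _ in range(copies)) / copies
             for pi in p]
    return p


def fst(p):
    """Variance-based F_ST among demes for one biallelic locus."""
    pbar = sum(p) / len(p)
    if pbar <= 0.0 or pbar >= 1.0:
        return 0.0
    var = sum((pi - pbar) ** 2 for pi in p) / len(p)
    return var / (pbar * (1 - pbar))


if __name__ == "__main__":
    # Cross different degrees of gene flow with strengths of selection.
    for m in (0.01, 0.3):
        for s_max in (0.0, 0.1):
            value = fst(simulate_cline(m=m, s_max=s_max, seed=1))
            print(f"m={m:<5} s_max={s_max:<4} F_ST={value:.3f}")
```

Crossing migration rates with selection strengths in this way gives a rough picture of how much differentiation each scenario would be expected to produce, which is the information needed to judge whether a proposed sampling design could distinguish among them.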

Next, we would calibrate, optimize 
and validate models predicting the 
FIGURE 1 | Schematic showing three major branches of evolutionary and population
genetics addressed in this essay. Bioinformatics work to develop, curate, archive and analyze
genomic and epigenomic data, modeling of the influences of population processes on genomic and
epigenomic patterns within populations and controlled and replicated experiments to test
hypothesized relationships are all critical to advancing our field. The greatest challenges and
opportunities lie in the intersections among these three branches of research. (A) Bioinformatics
should inform design of experiments, and results of genetic experiments should guide what
genomic and epigenomic data are collected for a given research effort. (B) Modeling should guide
design of experiments and generalize and extend experimental results to explore implications of
pattern-process relationships across scale, and experiments should guide the parameterization and
calibration of models. (C) Bioinformatics should provide modelers with genomic and epigenomic
data appropriate for model development, calibration, optimization and validation, while models
should inform bioinformaticians as to which genomic and epigenomic data is most relevant for a
particular question. The three way intersection of bioinformatics, modeling, and experimentation
(D) provides the strongest potential synergy to advance evolutionary and population genetics.



population process from the pattern in 
the observed genomic or epigenomic data 
(e.g., Cushman et al., 2006). We then
would use the simulations previously con- 
ducted to evaluate how well the popu- 
lation process inferred from the empir- 
ical data optimization matches the pat- 
terns produced through simulation of the 
same processes (e.g., Shirk et al., 2012).
This is a different example of synergy 
between modeling and genomic data, one 
in which the data are used to infer a pop- 
ulation process and simulations are used 
to evaluate how well that inferred pro- 
cess can explain the observed data. This 
combination of empirical modeling and 
simulation modeling would identify the 
most supported candidate models explain- 
ing influences of gene flow and selec- 
tion on the genetic characteristics of the 
population. 

A controlled and replicated experi- 
ment would then be designed to eval- 
uate the working hypothesis developed 
through empirical analysis and simulation
(e.g., Whitham et al., 2006). This
would involve synergy between genomics
and experimentation and between modeling
and experimentation. First, we would
design the experiment to control the fac- 
tors identified through the simulation 
modeling and empirical optimization to 
be the putative drivers of observed pat- 
terns of genetic variation. For example, 
if we found one pattern of genomic 
or epigenomic variation was associated 
with warm and dry climatic conditions, 
while another was associated with cold 
and wet conditions, we could construct 
a network of experimental common gar- 
dens replicated across the climate gradi- 
ent. We could use simulation modeling 
to evaluate how many individuals would 
be needed in each garden and how many 
gardens would be needed to provide high 
power to detect the inferred fitness rela- 
tionship, if present. There would also be 
synergy between the genomic data and 
experimental design; we would recipro- 
cally grow the genotypes that evinced 
the nonrandom patterns of apparent gene 
flow and selection. This is an example of 
synergy between genomic sampling and 
experimental design, in which patterns of
genomic data across the putative selection
gradients are then used to select individ- 
uals expressing those genetic characteris- 
tics for reciprocal transplanting across the 
experimental garden network. Designing 
the experimental garden to replicate and 
control the factors identified by model- 
ing as likely drivers of the genetic pat- 
terns in the population, and reciprocally 
transplanting in all gardens the genotypes 
which were found to be non-randomly 
associated with the putative selection gra- 
dients, would provide a rigorous basis of 
evaluating whether the processes inferred 
from empirical optimization and sim- 
ulation actually are responsible for the 
observed genomic or epigenomic structure 
of the population. 
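
The use of simulation to decide how many gardens and how many individuals per garden are needed can be illustrated with a simple power calculation. The sketch below rests on strong simplifying assumptions that are not part of the essay: fitness is a single normally distributed trait, local genotypes enjoy a fixed fitness advantage (`effect`) over foreign genotypes in every garden, garden-level effects cancel in the within-garden comparison, and significance is judged with a one-sided normal approximation (a t reference distribution would be more appropriate when gardens are few). The function name and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def estimate_power(n_gardens, n_per_group, effect, noise_sd=1.0,
                   alpha=0.05, n_sims=500, seed=0):
    """Estimate power to detect a local-genotype fitness advantage.

    Each simulated experiment plants n_per_group local and n_per_group
    foreign individuals in each of n_gardens gardens (at least two).
    The test statistic is the mean within-garden fitness difference
    divided by its standard error across gardens.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    rejections = 0
    for _ in range(n_sims):
        diffs = []
        for _ in range(n_gardens):
            local = [effect + rng.gauss(0, noise_sd)
                     for _ in range(n_per_group)]
            foreign = [rng.gauss(0, noise_sd)
                       for _ in range(n_per_group)]
            diffs.append(mean(local) - mean(foreign))
        se = stdev(diffs) / n_gardens ** 0.5
        if se > 0 and mean(diffs) / se > z_crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Compare candidate designs for an assumed effect size of 0.3 SD.
    for gardens, per_group in ((4, 5), (8, 15), (12, 30)):
        power = estimate_power(gardens, per_group, effect=0.3)
        print(f"{gardens} gardens x {per_group} per group: power = {power:.2f}")
```

Running the function over a grid of candidate designs shows where additional gardens or additional individuals stop buying meaningful power, given an assumed effect size and level of noise.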

Suppose the reciprocally transplanted 
common garden experiment confirmed 
the hypothesized relationship between cli- 
mate and the pattern of genomic or epige- 
nomic variation across the population. We 
would then explore the implications of the 
identified process through further simula- 
tion modeling. We might want to simulate 
what the expected genetic structure and 
allele frequency would be, given the process,
in a different study area than was
used to build the original model. We then
could test the model further by collecting
additional genomic samples from this new
study area and evaluating how well the 
observed genetic structure and allele fre- 
quencies match that predicted by simu- 
lations based on the process identified in 
the experiments. This is an example of a 
three-way synergy between genomic data,
modeling and experimentation. 

CONCLUSION 

Perhaps the greatest challenge facing 
the fields of evolutionary and popula- 
tion genetics today is to produce, pro- 
cess, curate, archive and analyze immense 
genomic datasets in a way such that 
research is led by a priori hypothe- 
ses, integrated with powerful modeling, 
and, where possible, linked to repli- 
cated and controlled experiments to test 
putative relationships between population 
processes and evolutionary and popula- 
tion genetic responses. No single person 
has the expertise or the time to effec- 
tively bring these components together. 
More than ever, success in advancing our 
field will depend on collaborations across 
large multi-disciplinary groups. Experts in
the development of genomic and epige- 
nomic data are needed to produce the 
raw genomic data for subsequent analy- 
sis. Bioinformatics specialists are needed 
to provide programming and computer 
science expertise to efficiently process, 
curate, archive and analyze vast genomic 
datasets, and to effectively utilize high per- 
formance computing resources. Modelers 
will be needed to work with the bioin- 
formaticians to explore the implications 
of hypotheses a priori, to refine hypothe- 
ses by optimizing fit to observed data, 
and predict how observed pattern-process
relationships may propagate across scale 
through space and time. Experimenters 
should work closely with modelers to rig- 
orously test hypotheses in controlled and 
replicated experiments. To be successful,
this entire integration should be led by the- 
oreticians who have a coherent vision for 
how each of these parts will synergize to 
address focused and falsifiable questions of 
importance in advancing the field. 